Overview

Dataset statistics

Number of variables: 42
Number of observations: 754472
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 679.6 MiB
Average record size in memory: 944.5 B
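
The average record size can be cross-checked against the total memory figure and the row count. The sketch below is illustrative only: it assumes 1 MiB = 1,048,576 bytes, and the function and constant names are hypothetical, not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
TOTAL_SIZE_MIB = 679.6
N_OBSERVATIONS = 754472

BYTES_PER_MIB = 1024 ** 2  # assumption: MiB is the binary unit


def average_record_size_bytes(total_mib: float, n_rows: int) -> float:
    """Return the mean in-memory size of one row, in bytes."""
    return total_mib * BYTES_PER_MIB / n_rows


avg_bytes = average_record_size_bytes(TOTAL_SIZE_MIB, N_OBSERVATIONS)
print(f"{avg_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported 944.5 B.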

Variable types

Numeric: 10
Categorical: 32

Warnings

df_index is highly correlated with building_id and 3 other fields (High correlation)
building_id is highly correlated with df_index and 3 other fields (High correlation)
district_id is highly correlated with df_index and 3 other fields (High correlation)
vdcmun_id is highly correlated with df_index and 3 other fields (High correlation)
ward_id is highly correlated with df_index and 3 other fields (High correlation)
df_index is uniformly distributed (Uniform)
df_index has unique values (Unique)
building_id has unique values (Unique)
count_families has 70842 (9.4%) zeros (Zeros)
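
The percentage in the Zeros warning is the zero count divided by the number of observations. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def percent_of_rows(count: int, n_rows: int) -> float:
    """Share of rows, in percent, that a given count represents."""
    return 100.0 * count / n_rows


# 70842 zeros in count_families out of 754472 observations.
zeros_pct = percent_of_rows(70842, 754472)
print(f"{zeros_pct:.1f}%")
```

The same helper applies to any Count / Frequency (%) pair in the tables that follow.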

Reproduction

Analysis started: 2021-08-09 15:09:59.368134
Analysis finished: 2021-08-09 15:15:33.869906
Duration: 5 minutes and 34.5 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
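
The reported duration is the difference between the two timestamps. A small sketch using only the standard library; the format string is an assumption matching the timestamps as printed above:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the report's timestamps

started = datetime.strptime("2021-08-09 15:09:59.368134", FMT)
finished = datetime.strptime("2021-08-09 15:15:33.869906", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.1f} seconds")
```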

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 754472
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 381069.7559
Minimum: 0
Maximum: 762105
Zeros: 1
Zeros (%): < 0.1%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 38105.55
Q1: 190567.75
median: 381073.5
Q3: 571576.25
95-th percentile: 724013.45
Maximum: 762105
Range: 762105
Interquartile range (IQR): 381008.5
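
Range and IQR are derived from the other quantile figures: range is maximum minus minimum, and IQR is Q3 minus Q1. A short sketch with the df_index values; the function name is hypothetical:

```python
def spread(minimum: float, q1: float, q3: float, maximum: float) -> dict:
    """Return the range and interquartile range for the given quantiles."""
    return {"range": maximum - minimum, "iqr": q3 - q1}


# Quantile statistics reported for df_index.
df_index_spread = spread(minimum=0, q1=190567.75, q3=571576.25, maximum=762105)
print(df_index_spread)
```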

Descriptive statistics

Standard deviation: 219996.1296
Coefficient of variation (CV): 0.5773119651
Kurtosis: -1.199900889
Mean: 381069.7559
Median Absolute Deviation (MAD): 190504.5
Skewness: -4.62529378 × 10^-5
Sum: 2.875064609 × 10^11
Variance: 4.839829703 × 10^10
Monotonicity: Not monotonic
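
Several of the descriptive statistics are functions of one another: CV is the standard deviation divided by the mean, and variance is the square of the standard deviation. The sketch below recomputes both for df_index. As a reference point for the Uniform warning, it also evaluates the textbook excess kurtosis of a discrete uniform distribution over n consecutive integers. Using n = 762106 is an assumption, since df_index has gaps (754472 distinct values between 0 and 762105).

```python
import math

# Figures reported for df_index.
MEAN = 381069.7559
STD = 219996.1296

cv = STD / MEAN        # coefficient of variation
variance = STD ** 2    # variance is the squared standard deviation


def discrete_uniform_excess_kurtosis(n: int) -> float:
    """Excess kurtosis of a discrete uniform distribution over n consecutive integers."""
    return -6.0 * (n * n + 1) / (5.0 * (n * n - 1))


# Assumption: treat the index as if it covered every integer from 0 to 762105.
reference_kurtosis = discrete_uniform_excess_kurtosis(762106)

print(f"CV = {cv:.10f}")
print(f"variance = {variance:.6e}")
print(f"reference kurtosis = {reference_kurtosis:.6f}")
```

The reference value is very close to -1.2, which is consistent with the reported kurtosis of -1.199900889.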
Histogram with fixed size bins (bins=50)
Value                     Count     Frequency (%)
0                         1         < 0.1%
115565                    1         < 0.1%
101220                    1         < 0.1%
99173                     1         < 0.1%
105318                    1         < 0.1%
103271                    1         < 0.1%
125800                    1         < 0.1%
123753                    1         < 0.1%
129898                    1         < 0.1%
127851                    1         < 0.1%
Other values (754462)     754462    > 99.9%

Value                     Count     Frequency (%)
0                         1         < 0.1%
1                         1         < 0.1%
2                         1         < 0.1%
3                         1         < 0.1%
4                         1         < 0.1%
5                         1         < 0.1%
6                         1         < 0.1%
7                         1         < 0.1%
8                         1         < 0.1%
9                         1         < 0.1%

Value                     Count     Frequency (%)
762105                    1         < 0.1%
762104                    1         < 0.1%
762103                    1         < 0.1%
762102                    1         < 0.1%
762101                    1         < 0.1%
762100                    1         < 0.1%
762099                    1         < 0.1%
762098                    1         < 0.1%
762097                    1         < 0.1%
762096                    1         < 0.1%

building_id
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 754472
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.607553746 × 10^11
Minimum: 1.20101 × 10^11
Maximum: 3.667090013 × 10^11
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 1.20101 × 10^11
5-th percentile: 1.255030005 × 10^11
Q1: 2.219090006 × 10^11
median: 2.463020003 × 10^11
Q3: 3.036080011 × 10^11
95-th percentile: 3.63801001 × 10^11
Maximum: 3.667090013 × 10^11
Range: 2.466080013 × 10^11
Interquartile range (IQR): 8.169900042 × 10^10

Descriptive statistics

Standard deviation: 5.801936234 × 10^10
Coefficient of variation (CV): 0.2225049529
Kurtosis: -0.08249809674
Mean: 2.607553746 × 10^11
Median Absolute Deviation (MAD): 3.980000151 × 10^10
Skewness: -0.1613184247
Sum: 1.96732629 × 10^17
Variance: 3.366246407 × 10^21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                     Count     Frequency (%)
2.036010005 × 10^11       1         < 0.1%
2.804050128 × 10^11       1         < 0.1%
2.048040002 × 10^11       1         < 0.1%
2.143080001 × 10^11       1         < 0.1%
2.423070114 × 10^11       1         < 0.1%
3.108070018 × 10^11       1         < 0.1%
3.113090003 × 10^11       1         < 0.1%
2.203080012 × 10^11       1         < 0.1%
3.642030001 × 10^11       1         < 0.1%
2.40401001 × 10^11        1         < 0.1%
Other values (754462)     754462    > 99.9%

Value                     Count     Frequency (%)
1.20101 × 10^11           1         < 0.1%
1.20101 × 10^11           1         < 0.1%
1.20101 × 10^11           1         < 0.1%
1.20101 × 10^11           1         < 0.1%
1.201010001 × 10^11       1         < 0.1%
1.201010001 × 10^11       1         < 0.1%
1.201010001 × 10^11       1         < 0.1%
1.201010001 × 10^11       1         < 0.1%
1.201010001 × 10^11       1         < 0.1%
1.201010001 × 10^11       1         < 0.1%

Value                     Count     Frequency (%)
3.667090013 × 10^11       1         < 0.1%
3.667090013 × 10^11       1         < 0.1%
3.667090013 × 10^11       1         < 0.1%
3.667090013 × 10^11       1         < 0.1%
3.667090012 × 10^11       1         < 0.1%
3.667090012 × 10^11       1         < 0.1%
3.667090012 × 10^11       1         < 0.1%
3.667090012 × 10^11       1         < 0.1%
3.667090012 × 10^11       1         < 0.1%
3.667090012 × 10^11       1         < 0.1%

district_id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.76821141
Minimum: 12
Maximum: 36
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 12
5-th percentile: 12
Q1: 22
median: 24
Q3: 30
95-th percentile: 36
Maximum: 36
Range: 24
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 5.807650562
Coefficient of variation (CV): 0.2253804298
Kurtosis: -0.1225071928
Mean: 25.76821141
Median Absolute Deviation (MAD): 4
Skewness: -0.1553353218
Sum: 19441394
Variance: 33.72880505
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value     Count     Frequency (%)
24        97071     12.9%
31        90076     11.9%
30        88219     11.7%
23        87842     11.6%
36        77317     10.2%
28        76369     10.1%
20        68035     9.0%
22        60050     8.0%
21        58026     7.7%
12        38955     5.2%

Value     Count     Frequency (%)
12        38955     5.2%
20        68035     9.0%
21        58026     7.7%
22        60050     8.0%
23        87842     11.6%
24        97071     12.9%
28        76369     10.1%
29        12512     1.7%
30        88219     11.7%
31        90076     11.9%

Value     Count     Frequency (%)
36        77317     10.2%
31        90076     11.9%
30        88219     11.7%
29        12512     1.7%
28        76369     10.1%
24        97071     12.9%
23        87842     11.6%
22        60050     8.0%
21        58026     7.7%
20        68035     9.0%

vdcmun_id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 110
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2582.726213
Minimum: 1201
Maximum: 3611
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 1201
5-th percentile: 1208
Q1: 2204
median: 2410
Q3: 3010
95-th percentile: 3608
Maximum: 3611
Range: 2410
Interquartile range (IQR): 806

Descriptive statistics

Standard deviation: 581.1821751
Coefficient of variation (CV): 0.2250266297
Kurtosis: -0.1218437698
Mean: 2582.726213
Median Absolute Deviation (MAD): 401
Skewness: -0.1563562265
Sum: 1948594611
Variance: 337772.7207
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count     Frequency (%)
3104                   32362     4.3%
2005                   15486     2.1%
3009                   15076     2.0%
2802                   15045     2.0%
2001                   14862     2.0%
2304                   14431     1.9%
2310                   13831     1.8%
2105                   13389     1.8%
3608                   12969     1.7%
2410                   12145     1.6%
Other values (100)     594876    78.8%

Value                  Count     Frequency (%)
1201                   4815      0.6%
1202                   3674      0.5%
1203                   3916      0.5%
1204                   3848      0.5%
1205                   5215      0.7%
1206                   4657      0.6%
1207                   7703      1.0%
1208                   5127      0.7%
2001                   14862     2.0%
2002                   2991      0.4%

Value                  Count     Frequency (%)
3611                   7131      0.9%
3610                   7821      1.0%
3609                   11616     1.5%
3608                   12969     1.7%
3607                   6219      0.8%
3606                   3648      0.5%
3605                   2724      0.4%
3604                   6691      0.9%
3603                   7245      1.0%
3602                   4128      0.5%

ward_id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 945
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 258278.0626
Minimum: 120101
Maximum: 361108
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 120101
5-th percentile: 120808
Q1: 220402
median: 241004
Q3: 301006
95-th percentile: 360803
Maximum: 361108
Range: 241007
Interquartile range (IQR): 80604

Descriptive statistics

Standard deviation: 58118.28875
Coefficient of variation (CV): 0.2250221647
Kurtosis: -0.1218507637
Mean: 258278.0626
Median Absolute Deviation (MAD): 40101
Skewness: -0.1563602283
Sum: 1.948635665 × 10^11
Variance: 3377735487
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count     Frequency (%)
310405                 2559      0.3%
310404                 2408      0.3%
310412                 2317      0.3%
200506                 2026      0.3%
310411                 1982      0.3%
310419                 1957      0.3%
310415                 1913      0.3%
241004                 1864      0.2%
310409                 1840      0.2%
300903                 1822      0.2%
Other values (935)     733784    97.3%

Value                  Count     Frequency (%)
120101                 598       0.1%
120102                 564       0.1%
120103                 435       0.1%
120104                 558       0.1%
120105                 437       0.1%
120106                 638       0.1%
120107                 420       0.1%
120108                 301       < 0.1%
120109                 288       < 0.1%
120110                 576       0.1%

Value                  Count     Frequency (%)
361108                 911       0.1%
361107                 931       0.1%
361106                 927       0.1%
361105                 904       0.1%
361104                 919       0.1%
361103                 1096      0.1%
361102                 817       0.1%
361101                 626       0.1%
361009                 994       0.1%
361008                 825       0.1%
Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 46.1 MiB
Private: 724097
Public: 19010
Institutional: 7744
Other: 3621

Length

Max length: 13
Median length: 7
Mean length: 7.026789596
Min length: 5

Characters and Unicode

Total characters: 5301516
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
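
The length and character totals for this categorical variable can be reproduced from its four value counts. A minimal sketch using only the labels and counts listed above; the variable names are illustrative:

```python
# Category labels and counts as reported for this variable.
value_counts = {
    "Private": 724097,
    "Public": 19010,
    "Institutional": 7744,
    "Other": 3621,
}

n_rows = sum(value_counts.values())
# Each row contributes len(label) characters.
total_chars = sum(len(label) * count for label, count in value_counts.items())
mean_length = total_chars / n_rows

print(n_rows, total_chars, round(mean_length, 9))
```

The totals agree with the reported 5301516 characters and mean length of about 7.0268.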

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Private
2nd row: Private
3rd row: Private
4th row: Private
5th row: Private

Value             Count     Frequency (%)
Private           724097    96.0%
Public            19010     2.5%
Institutional     7744      1.0%
Other             3621      0.5%

Histogram of lengths of the category

Value             Count     Frequency (%)
private           724097    96.0%
public            19010     2.5%
institutional     7744      1.0%
other             3621      0.5%

Most occurring characters

ValueCountFrequency (%)
i758595
14.3%
t750950
14.2%
P743107
14.0%
a731841
13.8%
r727718
13.7%
e727718
13.7%
v724097
13.7%
u26754
 
0.5%
l26754
 
0.5%
b19010
 
0.4%
Other values (7)64972
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter4547044
85.8%
Uppercase Letter754472
 
14.2%

Most frequent character per category

ValueCountFrequency (%)
i758595
16.7%
t750950
16.5%
a731841
16.1%
r727718
16.0%
e727718
16.0%
v724097
15.9%
u26754
 
0.6%
l26754
 
0.6%
b19010
 
0.4%
c19010
 
0.4%
Other values (4)34597
 
0.8%
ValueCountFrequency (%)
P743107
98.5%
I7744
 
1.0%
O3621
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Latin5301516
100.0%

Most frequent character per script

ValueCountFrequency (%)
i758595
14.3%
t750950
14.2%
P743107
14.0%
a731841
13.8%
r727718
13.7%
e727718
13.7%
v724097
13.7%
u26754
 
0.5%
l26754
 
0.5%
b19010
 
0.4%
Other values (7)64972
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII5301516
100.0%

Most frequent character per block

ValueCountFrequency (%)
i758595
14.3%
t750950
14.2%
P743107
14.0%
a731841
13.8%
r727718
13.7%
e727718
13.7%
v724097
13.7%
u26754
 
0.5%
l26754
 
0.5%
b19010
 
0.4%
Other values (7)64972
 
1.2%

count_families
Real number (ℝ≥0)

ZEROS

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9806487186
Minimum: 0
Maximum: 11
Zeros: 70842
Zeros (%): 9.4%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 1
Q3: 1
95-th percentile: 2
Maximum: 11
Range: 11
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4501782998
Coefficient of variation (CV): 0.4590617325
Kurtosis: 15.08826622
Mean: 0.9806487186
Median Absolute Deviation (MAD): 0
Skewness: 1.496671244
Sum: 739872
Variance: 0.2026605016
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value     Count     Frequency (%)
1         637014    84.4%
0         70842     9.4%
2         39343     5.2%
3         5615      0.7%
4         1204      0.2%
5         299       < 0.1%
6         104       < 0.1%
7         27        < 0.1%
8         15        < 0.1%
9         8         < 0.1%

Value     Count     Frequency (%)
0         70842     9.4%
1         637014    84.4%
2         39343     5.2%
3         5615      0.7%
4         1204      0.2%
5         299       < 0.1%
6         104       < 0.1%
7         27        < 0.1%
8         15        < 0.1%
9         8         < 0.1%

Value     Count     Frequency (%)
11        1         < 0.1%
9         8         < 0.1%
8         15        < 0.1%
7         27        < 0.1%
6         104       < 0.1%
5         299       < 0.1%
4         1204      0.2%
3         5615      0.7%
2         39343     5.2%
1         637014    84.4%
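
Because count_families has only 11 distinct values, its Sum and Mean can be rebuilt from the frequency tables above. The sketch below combines the ten smallest values with the single maximum (11). The dictionary is transcribed from those tables and the names are illustrative:

```python
# value -> count, taken from the count_families frequency tables.
family_counts = {
    0: 70842, 1: 637014, 2: 39343, 3: 5615, 4: 1204,
    5: 299, 6: 104, 7: 27, 8: 15, 9: 8, 11: 1,
}

n_rows = sum(family_counts.values())
total_families = sum(value * count for value, count in family_counts.items())
mean_families = total_families / n_rows

print(n_rows, total_families, round(mean_families, 10))
```

The row total equals the number of observations, and the weighted sum reproduces the reported Sum of 739872.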
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size43.2 MiB
0.0
663061 
1.0
91411 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2263416
Distinct characters3
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row1.0
3rd row0.0
4th row0.0
5th row0.0
ValueCountFrequency (%)
0.0663061
87.9%
1.091411
 
12.1%
Histogram of lengths of the category
ValueCountFrequency (%)
0.0663061
87.9%
1.091411
 
12.1%

Most occurring characters

ValueCountFrequency (%)
01417533
62.6%
.754472
33.3%
191411
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1508944
66.7%
Other Punctuation754472
33.3%

Most frequent character per category

ValueCountFrequency (%)
01417533
93.9%
191411
 
6.1%
ValueCountFrequency (%)
.754472
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common2263416
100.0%

Most frequent character per script

ValueCountFrequency (%)
01417533
62.6%
.754472
33.3%
191411
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII2263416
100.0%

Most frequent character per block

ValueCountFrequency (%)
01417533
62.6%
.754472
33.3%
191411
 
4.0%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
700266 
1
 
54206

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
0700266
92.8%
154206
 
7.2%
Histogram of lengths of the category
ValueCountFrequency (%)
0700266
92.8%
154206
 
7.2%

Most occurring characters

ValueCountFrequency (%)
0700266
92.8%
154206
 
7.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0700266
92.8%
154206
 
7.2%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0700266
92.8%
154206
 
7.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0700266
92.8%
154206
 
7.2%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
728014 
1
 
26458

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
0728014
96.5%
126458
 
3.5%
Histogram of lengths of the category
ValueCountFrequency (%)
0728014
96.5%
126458
 
3.5%

Most occurring characters

ValueCountFrequency (%)
0728014
96.5%
126458
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0728014
96.5%
126458
 
3.5%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0728014
96.5%
126458
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0728014
96.5%
126458
 
3.5%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
748236 
1
 
6236

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
0748236
99.2%
16236
 
0.8%
Histogram of lengths of the category
ValueCountFrequency (%)
0748236
99.2%
16236
 
0.8%

Most occurring characters

ValueCountFrequency (%)
0748236
99.2%
16236
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0748236
99.2%
16236
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0748236
99.2%
16236
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0748236
99.2%
16236
 
0.8%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
753596 
1
 
876

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
0753596
99.9%
1876
 
0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
0753596
99.9%
1876
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0753596
99.9%
1876
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0753596
99.9%
1876
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0753596
99.9%
1876
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0753596
99.9%
1876
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
754154 
1
 
318

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
0754154
> 99.9%
1318
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
0754154
> 99.9%
1318
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0754154
> 99.9%
1318
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0754154
> 99.9%
1318
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0754154
> 99.9%
1318
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0754154
> 99.9%
1318
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
753589 
1
 
883

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
0753589
99.9%
1883
 
0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
0753589
99.9%
1883
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0753589
99.9%
1883
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0753589
99.9%
1883
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0753589
99.9%
1883
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0753589
99.9%
1883
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
754301 
1
 
171

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
0754301
> 99.9%
1171
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
0754301
> 99.9%
1171
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0754301
> 99.9%
1171
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0754301
> 99.9%
1171
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0754301
> 99.9%
1171
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0754301
> 99.9%
1171
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
754332 
1
 
140

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
0754332
> 99.9%
1140
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
0754332
> 99.9%
1140
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0754332
> 99.9%
1140
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0754332
> 99.9%
1140
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0754332
> 99.9%
1140
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0754332
> 99.9%
1140
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
754398 
1
 
74

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
0754398
> 99.9%
174
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
0754398
> 99.9%
174
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0754398
> 99.9%
174
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0754398
> 99.9%
174
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0754398
> 99.9%
174
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0754398
> 99.9%
174
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
751085 
1
 
3387

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
0751085
99.6%
13387
 
0.4%
Histogram of lengths of the category
ValueCountFrequency (%)
0751085
99.6%
13387
 
0.4%

Most occurring characters

ValueCountFrequency (%)
0751085
99.6%
13387
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0751085
99.6%
13387
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0751085
99.6%
13387
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0751085
99.6%
13387
 
0.4%

count_floors_pre_eq
Real number (ℝ≥0)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.087832285
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 2
Q3: 2
95-th percentile: 3
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.6551028871
Coefficient of variation (CV): 0.3137717966
Kurtosis: 1.607943784
Mean: 2.087832285
Median Absolute Deviation (MAD): 0
Skewness: 0.4261028197
Sum: 1575211
Variance: 0.4291597927
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value     Count     Frequency (%)
2         463348    61.4%
3         165377    21.9%
1         117726    15.6%
4         6030      0.8%
5         1553      0.2%
6         329       < 0.1%
7         85        < 0.1%
8         12        < 0.1%
9         12        < 0.1%

Value     Count     Frequency (%)
1         117726    15.6%
2         463348    61.4%
3         165377    21.9%
4         6030      0.8%
5         1553      0.2%
6         329       < 0.1%
7         85        < 0.1%
8         12        < 0.1%
9         12        < 0.1%

Value     Count     Frequency (%)
9         12        < 0.1%
8         12        < 0.1%
7         85        < 0.1%
6         329       < 0.1%
5         1553      0.2%
4         6030      0.8%
3         165377    21.9%
2         463348    61.4%
1         117726    15.6%

age_building
Real number (ℝ≥0)

Distinct: 176
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.33110175
Minimum: 0
Maximum: 999
Zeros: 4693
Zeros (%): 0.6%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 9
median: 16
Q3: 27
95-th percentile: 54
Maximum: 999
Range: 999
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 65.06845647
Coefficient of variation (CV): 2.674291413
Kurtosis: 204.9754775
Mean: 24.33110175
Median Absolute Deviation (MAD): 9
Skewness: 13.90794554
Sum: 18357135
Variance: 4233.904027
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count     Frequency (%)
15                     49473     6.6%
20                     46042     6.1%
10                     39369     5.2%
25                     36681     4.9%
12                     36087     4.8%
30                     30546     4.0%
5                      28875     3.8%
3                      24202     3.2%
4                      23184     3.1%
7                      23151     3.1%
Other values (166)     416862    55.3%

Value                  Count     Frequency (%)
0                      4693      0.6%
1                      19174     2.5%
2                      21399     2.8%
3                      24202     3.2%
4                      23184     3.1%
5                      28875     3.8%
6                      19937     2.6%
7                      23151     3.1%
8                      22630     3.0%
9                      14995     2.0%

Value                  Count     Frequency (%)
999                    3116      0.4%
200                    259       < 0.1%
199                    1         < 0.1%
196                    2         < 0.1%
195                    2         < 0.1%
193                    1         < 0.1%
190                    7         < 0.1%
187                    1         < 0.1%
185                    1         < 0.1%
180                    20        < 0.1%
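
The maximum value 999 occurs 3116 times, far more often than any neighbouring value. The report does not say what 999 means. Purely as an illustration, and assuming it is a placeholder rather than a true age, the sketch below shows how much those rows contribute to the reported mean of age_building. All names are hypothetical.

```python
# Figures reported for age_building.
N_ROWS = 754472
TOTAL_SUM = 18357135

# Assumption (not stated in the report): 999 is a placeholder value.
SUSPECT_VALUE = 999
SUSPECT_COUNT = 3116

mean_all = TOTAL_SUM / N_ROWS
mean_without_suspect = (TOTAL_SUM - SUSPECT_VALUE * SUSPECT_COUNT) / (N_ROWS - SUSPECT_COUNT)

print(f"mean with all rows:     {mean_all:.3f}")
print(f"mean without 999 rows:  {mean_without_suspect:.3f}")
```

Under that assumption the mean falls from about 24.3 to about 20.3, closer to the reported median of 16.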

plinth_area_sq_ft
Real number (ℝ≥0)

Distinct: 2123
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 406.6594956
Minimum: 70
Maximum: 5000
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 70
5-th percentile: 176
Q1: 280
median: 358
Q3: 470
95-th percentile: 800
Maximum: 5000
Range: 4930
Interquartile range (IQR): 190

Descriptive statistics

Standard deviation: 226.7757616
Coefficient of variation (CV): 0.557655149
Kurtosis: 34.2743952
Mean: 406.6594956
Median Absolute Deviation (MAD): 92
Skewness: 3.8388692
Sum: 306813203
Variance: 51427.24607
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count     Frequency (%)
300                     27382     3.6%
450                     21086     2.8%
400                     19744     2.6%
350                     18985     2.5%
360                     14825     2.0%
250                     13848     1.8%
280                     13771     1.8%
200                     12326     1.6%
320                     11739     1.6%
420                     10827     1.4%
Other values (2113)     589939    78.2%

Value                   Count     Frequency (%)
70                      111       < 0.1%
71                      4         < 0.1%
72                      56        < 0.1%
73                      15        < 0.1%
74                      4         < 0.1%
75                      101       < 0.1%
76                      2         < 0.1%
77                      26        < 0.1%
78                      37        < 0.1%
79                      16        < 0.1%

Value                   Count     Frequency (%)
5000                    6         < 0.1%
4995                    1         < 0.1%
4928                    1         < 0.1%
4901                    1         < 0.1%
4890                    1         < 0.1%
4800                    1         < 0.1%
4795                    1         < 0.1%
4738                    1         < 0.1%
4701                    1         < 0.1%
4652                    1         < 0.1%

height_ft_pre_eq
Real number (ℝ≥0)

Distinct: 79
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.04916021
Minimum: 6
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.8 MiB

Quantile statistics

Minimum: 6
5-th percentile: 8
Q1: 12
median: 16
Q3: 18
95-th percentile: 24
Maximum: 99
Range: 93
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 5.493172547
Coefficient of variation (CV): 0.3422716501
Kurtosis: 25.72176514
Mean: 16.04916021
Median Absolute Deviation (MAD): 3
Skewness: 2.493579751
Sum: 12108642
Variance: 30.17494463
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
18                    101056    13.4%
14                    94042     12.5%
12                    80975     10.7%
16                    73701     9.8%
15                    59447     7.9%
20                    49607     6.6%
21                    37396     5.0%
10                    29831     4.0%
17                    28309     3.8%
13                    24903     3.3%
Other values (69)     175205    23.2%

Value                 Count     Frequency (%)
6                     9471      1.3%
7                     17276     2.3%
8                     21192     2.8%
9                     21672     2.9%
10                    29831     4.0%
11                    11081     1.5%
12                    80975     10.7%
13                    24903     3.3%
14                    94042     12.5%
15                    59447     7.9%

Value                 Count     Frequency (%)
99                    300       < 0.1%
97                    1         < 0.1%
96                    2         < 0.1%
95                    2         < 0.1%
93                    1         < 0.1%
90                    3         < 0.1%
89                    1         < 0.1%
85                    3         < 0.1%
81                    1         < 0.1%
80                    3         < 0.1%
Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 45.1 MiB

Flat: 625378
Moderate slope: 104540
Steep slope: 24554

Length

Max length: 14
Median length: 4
Mean length: 5.613417065
Min length: 4
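
The length statistics follow directly from the category labels and their counts. Below is a minimal sketch under that assumption; the variable names `counts`, `label_lengths` and `per_row` are illustrative, and the counts are copied from this variable's frequency table.

```python
import pandas as pd

# Category counts as listed in the report for this variable.
counts = pd.Series({"Flat": 625378, "Moderate slope": 104540, "Steep slope": 24554})

# Length of each label, then one entry per observation.
label_lengths = counts.index.to_series().str.len()
per_row = label_lengths.repeat(counts.values)

print("Max length:", per_row.max())
print("Median length:", per_row.median())
print("Mean length:", per_row.mean())
print("Min length:", per_row.min())
print("Total characters:", int((label_lengths * counts).sum()))
```

The mean length is the total character count divided by the number of observations, 4235166 / 754472.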

Characters and Unicode

Total characters: 4235166
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Flat
2nd row: Moderate slope
3rd row: Flat
4th row: Flat
5th row: Flat
Value            Count    Frequency (%)
Flat             625378   82.9%
Moderate slope   104540   13.9%
Steep slope      24554    3.3%
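
A frequency table of this shape can be derived from category counts by dividing by the total. The sketch below is illustrative only: `counts` and `freq` are assumed names, and on a raw column the counts would come from `Series.value_counts()`.

```python
import pandas as pd

# Category counts as listed in the table above.
counts = pd.Series({"Flat": 625378, "Moderate slope": 104540, "Steep slope": 24554})

freq = pd.DataFrame(
    {
        "count": counts,
        # Share of all observations, as a percentage rounded to one decimal.
        "frequency_pct": (100 * counts / counts.sum()).round(1),
    }
)
print(freq)

# On a raw column the same counts come from value_counts(); toy data shown here.
toy = pd.Series(["Flat", "Flat", "Moderate slope", "Flat", "Steep slope"])
print(toy.value_counts())
```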
Histogram of lengths of the category
Value      Count    Frequency (%)
flat       625378   70.8%
slope      129094   14.6%
moderate   104540   11.8%
steep      24554    2.8%

Most occurring characters

Value              Count    Frequency (%)
l                  754472   17.8%
t                  754472   17.8%
a                  729918   17.2%
F                  625378   14.8%
e                  387282   9.1%
o                  233634   5.5%
p                  153648   3.6%
(space)            129094   3.0%
s                  129094   3.0%
M                  104540   2.5%
Other values (3)   233634   5.5%

Most occurring categories

Value              Count     Frequency (%)
Lowercase Letter   3351600   79.1%
Uppercase Letter   754472    17.8%
Space Separator    129094    3.0%
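
The category totals above can be reproduced with the standard-library `unicodedata` module by weighting each character's general category by the count of the label it appears in. This is a minimal sketch under that assumption, not the profiler's implementation; `value_counts`, `category_totals` and `CATEGORY_NAMES` are illustrative names.

```python
import unicodedata
from collections import Counter

# Category counts as listed in the report for this variable.
value_counts = {"Flat": 625378, "Moderate slope": 104540, "Steep slope": 24554}

# Two-letter Unicode general categories mapped to the names used in the report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
}

category_totals = Counter()
for label, n in value_counts.items():
    for ch in label:
        # Each character in a label occurs once per observation carrying that label.
        category_totals[unicodedata.category(ch)] += n

for code, total in category_totals.most_common():
    print(CATEGORY_NAMES.get(code, code), total)
```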

Most frequent character per category

ValueCountFrequency (%)
l754472
22.5%
t754472
22.5%
a729918
21.8%
e387282
11.6%
o233634
 
7.0%
p153648
 
4.6%
s129094
 
3.9%
d104540
 
3.1%
r104540
 
3.1%
ValueCountFrequency (%)
F625378
82.9%
M104540
 
13.9%
S24554
 
3.3%
ValueCountFrequency (%)
129094
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin4106072
97.0%
Common129094
 
3.0%

Most frequent character per script

ValueCountFrequency (%)
l754472
18.4%
t754472
18.4%
a729918
17.8%
F625378
15.2%
e387282
9.4%
o233634
 
5.7%
p153648
 
3.7%
s129094
 
3.1%
M104540
 
2.5%
d104540
 
2.5%
Other values (2)129094
 
3.1%
ValueCountFrequency (%)
129094
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII4235166
100.0%

Most frequent character per block

ValueCountFrequency (%)
l754472
17.8%
t754472
17.8%
a729918
17.2%
F625378
14.8%
e387282
9.1%
o233634
 
5.5%
p153648
 
3.6%
129094
 
3.0%
s129094
 
3.0%
M104540
 
2.5%
Other values (3)233634
 
5.5%

foundation_type
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size55.5 MiB
Mud mortar-Stone/Brick
622432 
Bamboo/Timber
 
56860
Cement-Stone/Brick
 
38843
RC
 
31819
Other
 
4518

Length

Max length22
Median length22
Mean length20.1705113
Min length2

Characters and Unicode

Total characters15218086
Distinct characters24
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowRC
2nd rowMud mortar-Stone/Brick
3rd rowMud mortar-Stone/Brick
4th rowMud mortar-Stone/Brick
5th rowMud mortar-Stone/Brick
Value                    Count    Frequency (%)
Mud mortar-Stone/Brick   622432   82.5%
Bamboo/Timber            56860    7.5%
Cement-Stone/Brick       38843    5.1%
RC                       31819    4.2%
Other                    4518     0.6%
Histogram of lengths of the category
ValueCountFrequency (%)
mud622432
45.2%
mortar-stone/brick622432
45.2%
bamboo/timber56860
 
4.1%
cement-stone/brick38843
 
2.8%
rc31819
 
2.3%
other4518
 
0.3%

Most occurring characters

ValueCountFrequency (%)
r1967517
 
12.9%
o1397427
 
9.2%
t1327068
 
8.7%
e800339
 
5.3%
m774995
 
5.1%
/718135
 
4.7%
B718135
 
4.7%
i718135
 
4.7%
n700118
 
4.6%
a679292
 
4.5%
Other values (14)5416925
35.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter11050543
72.6%
Uppercase Letter2165701
 
14.2%
Other Punctuation718135
 
4.7%
Dash Punctuation661275
 
4.3%
Space Separator622432
 
4.1%

Most frequent character per category

ValueCountFrequency (%)
r1967517
17.8%
o1397427
12.6%
t1327068
12.0%
e800339
7.2%
m774995
 
7.0%
i718135
 
6.5%
n700118
 
6.3%
a679292
 
6.1%
c661275
 
6.0%
k661275
 
6.0%
Other values (4)1363102
12.3%
ValueCountFrequency (%)
B718135
33.2%
S661275
30.5%
M622432
28.7%
C70662
 
3.3%
T56860
 
2.6%
R31819
 
1.5%
O4518
 
0.2%
ValueCountFrequency (%)
622432
100.0%
ValueCountFrequency (%)
-661275
100.0%
ValueCountFrequency (%)
/718135
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin13216244
86.8%
Common2001842
 
13.2%

Most frequent character per script

ValueCountFrequency (%)
r1967517
14.9%
o1397427
10.6%
t1327068
 
10.0%
e800339
 
6.1%
m774995
 
5.9%
B718135
 
5.4%
i718135
 
5.4%
n700118
 
5.3%
a679292
 
5.1%
S661275
 
5.0%
Other values (11)3471943
26.3%
ValueCountFrequency (%)
/718135
35.9%
-661275
33.0%
622432
31.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII15218086
100.0%

Most frequent character per block

ValueCountFrequency (%)
r1967517
 
12.9%
o1397427
 
9.2%
t1327068
 
8.7%
e800339
 
5.3%
m774995
 
5.1%
/718135
 
4.7%
B718135
 
4.7%
i718135
 
4.7%
n700118
 
4.6%
a679292
 
4.5%
Other values (14)5416925
35.6%

roof_type
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size57.7 MiB
Bamboo/Timber-Light roof
498705 
Bamboo/Timber-Heavy roof
211626 
RCC/RB/RBC
 
44141

Length

Max length24
Median length24
Mean length23.18091858
Min length10

Characters and Unicode

Total characters17489354
Distinct characters22
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowRCC/RB/RBC
2nd rowBamboo/Timber-Light roof
3rd rowBamboo/Timber-Heavy roof
4th rowBamboo/Timber-Heavy roof
5th rowBamboo/Timber-Light roof
Value                      Count    Frequency (%)
Bamboo/Timber-Light roof   498705   66.1%
Bamboo/Timber-Heavy roof   211626   28.0%
RCC/RB/RBC                 44141    5.9%
Histogram of lengths of the category
ValueCountFrequency (%)
roof710331
48.5%
bamboo/timber-light498705
34.0%
bamboo/timber-heavy211626
 
14.4%
rcc/rb/rbc44141
 
3.0%

Most occurring characters

ValueCountFrequency (%)
o2841324
16.2%
m1420662
 
8.1%
b1420662
 
8.1%
r1420662
 
8.1%
i1209036
 
6.9%
a921957
 
5.3%
e921957
 
5.3%
/798613
 
4.6%
B798613
 
4.6%
T710331
 
4.1%
Other values (12)5025537
28.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter12785958
73.1%
Uppercase Letter2484121
 
14.2%
Other Punctuation798613
 
4.6%
Dash Punctuation710331
 
4.1%
Space Separator710331
 
4.1%

Most frequent character per category

ValueCountFrequency (%)
o2841324
22.2%
m1420662
11.1%
b1420662
11.1%
r1420662
11.1%
i1209036
9.5%
a921957
 
7.2%
e921957
 
7.2%
f710331
 
5.6%
g498705
 
3.9%
h498705
 
3.9%
Other values (3)921957
 
7.2%
ValueCountFrequency (%)
B798613
32.1%
T710331
28.6%
L498705
20.1%
H211626
 
8.5%
R132423
 
5.3%
C132423
 
5.3%
ValueCountFrequency (%)
/798613
100.0%
ValueCountFrequency (%)
-710331
100.0%
ValueCountFrequency (%)
710331
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin15270079
87.3%
Common2219275
 
12.7%

Most frequent character per script

ValueCountFrequency (%)
o2841324
18.6%
m1420662
9.3%
b1420662
9.3%
r1420662
9.3%
i1209036
7.9%
a921957
 
6.0%
e921957
 
6.0%
B798613
 
5.2%
T710331
 
4.7%
f710331
 
4.7%
Other values (9)2894544
19.0%
ValueCountFrequency (%)
/798613
36.0%
-710331
32.0%
710331
32.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII17489354
100.0%

Most frequent character per block

ValueCountFrequency (%)
o2841324
16.2%
m1420662
 
8.1%
b1420662
 
8.1%
r1420662
 
8.1%
i1209036
 
6.9%
a921957
 
5.3%
e921957
 
5.3%
/798613
 
4.6%
B798613
 
4.6%
T710331
 
4.1%
Other values (12)5025537
28.7%
Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size43.6 MiB
Mud
611997 
RC
72418 
Brick/Stone
65464 
Timber
 
3546
Other
 
1047

Length

Max length11
Median length3
Mean length3.61503409
Min length2

Characters and Unicode

Total characters2727442
Distinct characters21
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowRC
2nd rowMud
3rd rowMud
4th rowMud
5th rowMud
Value         Count    Frequency (%)
Mud           611997   81.1%
RC            72418    9.6%
Brick/Stone   65464    8.7%
Timber        3546     0.5%
Other         1047     0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
mud611997
81.1%
rc72418
 
9.6%
brick/stone65464
 
8.7%
timber3546
 
0.5%
other1047
 
0.1%

Most occurring characters

ValueCountFrequency (%)
M611997
22.4%
u611997
22.4%
d611997
22.4%
R72418
 
2.7%
C72418
 
2.7%
r70057
 
2.6%
e70057
 
2.6%
i69010
 
2.5%
t66511
 
2.4%
B65464
 
2.4%
Other values (11)405516
14.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1769624
64.9%
Uppercase Letter892354
32.7%
Other Punctuation65464
 
2.4%

Most frequent character per category

ValueCountFrequency (%)
u611997
34.6%
d611997
34.6%
r70057
 
4.0%
e70057
 
4.0%
i69010
 
3.9%
t66511
 
3.8%
c65464
 
3.7%
k65464
 
3.7%
o65464
 
3.7%
n65464
 
3.7%
Other values (3)8139
 
0.5%
ValueCountFrequency (%)
M611997
68.6%
R72418
 
8.1%
C72418
 
8.1%
B65464
 
7.3%
S65464
 
7.3%
T3546
 
0.4%
O1047
 
0.1%
ValueCountFrequency (%)
/65464
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2661978
97.6%
Common65464
 
2.4%

Most frequent character per script

ValueCountFrequency (%)
M611997
23.0%
u611997
23.0%
d611997
23.0%
R72418
 
2.7%
C72418
 
2.7%
r70057
 
2.6%
e70057
 
2.6%
i69010
 
2.6%
t66511
 
2.5%
B65464
 
2.5%
Other values (10)340052
12.8%
ValueCountFrequency (%)
/65464
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII2727442
100.0%

Most frequent character per block

ValueCountFrequency (%)
M611997
22.4%
u611997
22.4%
d611997
22.4%
R72418
 
2.7%
C72418
 
2.7%
r70057
 
2.6%
e70057
 
2.6%
i69010
 
2.5%
t66511
 
2.4%
B65464
 
2.4%
Other values (11)405516
14.9%

other_floor_type
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size52.2 MiB
TImber/Bamboo-Mud
482049 
Timber-Planck
122371 
Not applicable
117645 
RCC/RB/RBC
 
32407

Length

Max length17
Median length17
Mean length15.58275986
Min length10

Characters and Unicode

Total characters11756756
Distinct characters26
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNot applicable
2nd rowTImber/Bamboo-Mud
3rd rowTImber/Bamboo-Mud
4th rowTImber/Bamboo-Mud
5th rowTImber/Bamboo-Mud
Value               Count    Frequency (%)
TImber/Bamboo-Mud   482049   63.9%
Timber-Planck       122371   16.2%
Not applicable      117645   15.6%
RCC/RB/RBC          32407    4.3%
Histogram of lengths of the category
ValueCountFrequency (%)
timber/bamboo-mud482049
55.3%
timber-planck122371
 
14.0%
not117645
 
13.5%
applicable117645
 
13.5%
rcc/rb/rbc32407
 
3.7%

Most occurring characters

ValueCountFrequency (%)
b1204114
 
10.2%
m1086469
 
9.2%
o1081743
 
9.2%
a839710
 
7.1%
e722065
 
6.1%
T604420
 
5.1%
r604420
 
5.1%
-604420
 
5.1%
/546863
 
4.7%
B546863
 
4.7%
Other values (16)3915669
33.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7937989
67.5%
Uppercase Letter2549839
 
21.7%
Dash Punctuation604420
 
5.1%
Other Punctuation546863
 
4.7%
Space Separator117645
 
1.0%

Most frequent character per category

ValueCountFrequency (%)
b1204114
15.2%
m1086469
13.7%
o1081743
13.6%
a839710
10.6%
e722065
9.1%
r604420
7.6%
u482049
6.1%
d482049
6.1%
l357661
 
4.5%
i240016
 
3.0%
Other values (5)837693
10.6%
ValueCountFrequency (%)
T604420
23.7%
B546863
21.4%
I482049
18.9%
M482049
18.9%
P122371
 
4.8%
N117645
 
4.6%
R97221
 
3.8%
C97221
 
3.8%
ValueCountFrequency (%)
117645
100.0%
ValueCountFrequency (%)
/546863
100.0%
ValueCountFrequency (%)
-604420
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin10487828
89.2%
Common1268928
 
10.8%

Most frequent character per script

ValueCountFrequency (%)
b1204114
11.5%
m1086469
 
10.4%
o1081743
 
10.3%
a839710
 
8.0%
e722065
 
6.9%
T604420
 
5.8%
r604420
 
5.8%
B546863
 
5.2%
I482049
 
4.6%
M482049
 
4.6%
Other values (13)2833926
27.0%
ValueCountFrequency (%)
-604420
47.6%
/546863
43.1%
117645
 
9.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII11756756
100.0%

Most frequent character per block

ValueCountFrequency (%)
b1204114
 
10.2%
m1086469
 
9.2%
o1081743
 
9.2%
a839710
 
7.1%
e722065
 
6.1%
T604420
 
5.1%
r604420
 
5.1%
-604420
 
5.1%
/546863
 
4.7%
B546863
 
4.7%
Other values (16)3915669
33.3%

position
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size50.1 MiB
Not attached
598416 
Attached-1 side
128114 
Attached-2 side
 
26649
Attached-3 side
 
1293

Length

Max length15
Median length12
Mean length12.62052402
Min length12

Characters and Unicode

Total characters9521832
Distinct characters16
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNot attached
2nd rowAttached-1 side
3rd rowNot attached
4th rowNot attached
5th rowNot attached
Value             Count    Frequency (%)
Not attached      598416   79.3%
Attached-1 side   128114   17.0%
Attached-2 side   26649    3.5%
Attached-3 side   1293     0.2%
Histogram of lengths of the category
ValueCountFrequency (%)
attached598416
39.7%
not598416
39.7%
side156056
 
10.3%
attached-1128114
 
8.5%
attached-226649
 
1.8%
attached-31293
 
0.1%

Most occurring characters

ValueCountFrequency (%)
t2107360
22.1%
a1352888
14.2%
e910528
9.6%
d910528
9.6%
754472
 
7.9%
c754472
 
7.9%
h754472
 
7.9%
N598416
 
6.3%
o598416
 
6.3%
A156056
 
1.6%
Other values (6)624224
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7700776
80.9%
Uppercase Letter754472
 
7.9%
Space Separator754472
 
7.9%
Dash Punctuation156056
 
1.6%
Decimal Number156056
 
1.6%

Most frequent character per category

ValueCountFrequency (%)
t2107360
27.4%
a1352888
17.6%
e910528
11.8%
d910528
11.8%
c754472
 
9.8%
h754472
 
9.8%
o598416
 
7.8%
s156056
 
2.0%
i156056
 
2.0%
ValueCountFrequency (%)
1128114
82.1%
226649
 
17.1%
31293
 
0.8%
ValueCountFrequency (%)
N598416
79.3%
A156056
 
20.7%
ValueCountFrequency (%)
754472
100.0%
ValueCountFrequency (%)
-156056
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8455248
88.8%
Common1066584
 
11.2%

Most frequent character per script

ValueCountFrequency (%)
t2107360
24.9%
a1352888
16.0%
e910528
10.8%
d910528
10.8%
c754472
 
8.9%
h754472
 
8.9%
N598416
 
7.1%
o598416
 
7.1%
A156056
 
1.8%
s156056
 
1.8%
ValueCountFrequency (%)
754472
70.7%
-156056
 
14.6%
1128114
 
12.0%
226649
 
2.5%
31293
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII9521832
100.0%

Most frequent character per block

ValueCountFrequency (%)
t2107360
22.1%
a1352888
14.2%
e910528
9.6%
d910528
9.6%
754472
 
7.9%
c754472
 
7.9%
h754472
 
7.9%
N598416
 
6.3%
o598416
 
6.3%
A156056
 
1.6%
Other values (6)624224
 
6.6%
Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size48.8 MiB
Rectangular
723916 
Square
 
17411
L-shape
 
9979
T-shape
 
961
Multi-projected
 
930
Other values (5)
 
1275

Length

Max length31
Median length11
Mean length10.82722222
Min length6

Characters and Unicode

Total characters8168836
Distinct characters32
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowRectangular
2nd rowRectangular
3rd rowRectangular
4th rowRectangular
5th rowRectangular
Value                             Count    Frequency (%)
Rectangular                       723916   96.0%
Square                            17411    2.3%
L-shape                           9979     1.3%
T-shape                           961      0.1%
Multi-projected                   930      0.1%
Others                            513      0.1%
U-shape                           447      0.1%
E-shape                           138      < 0.1%
Building with Central Courtyard   98       < 0.1%
H-shape                           79       < 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
rectangular723916
95.9%
square17411
 
2.3%
l-shape9979
 
1.3%
t-shape961
 
0.1%
multi-projected930
 
0.1%
others513
 
0.1%
u-shape447
 
0.1%
e-shape138
 
< 0.1%
building98
 
< 0.1%
central98
 
< 0.1%
Other values (3)275
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a1477043
18.1%
e755402
9.2%
r743064
9.1%
u742453
9.1%
t726583
8.9%
l725042
8.9%
c724846
8.9%
n724112
8.9%
g724014
8.9%
R723916
8.9%
Other values (22)102361
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7401340
90.6%
Uppercase Letter754668
 
9.2%
Dash Punctuation12534
 
0.2%
Space Separator294
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
a1477043
20.0%
e755402
10.2%
r743064
10.0%
u742453
10.0%
t726583
9.8%
l725042
9.8%
c724846
9.8%
n724112
9.8%
g724014
9.8%
q17411
 
0.2%
Other values (9)41370
 
0.6%
ValueCountFrequency (%)
R723916
95.9%
S17411
 
2.3%
L9979
 
1.3%
T961
 
0.1%
M930
 
0.1%
O513
 
0.1%
U447
 
0.1%
C196
 
< 0.1%
E138
 
< 0.1%
B98
 
< 0.1%
ValueCountFrequency (%)
-12534
100.0%
ValueCountFrequency (%)
294
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8156008
99.8%
Common12828
 
0.2%

Most frequent character per script

ValueCountFrequency (%)
a1477043
18.1%
e755402
9.3%
r743064
9.1%
u742453
9.1%
t726583
8.9%
l725042
8.9%
c724846
8.9%
n724112
8.9%
g724014
8.9%
R723916
8.9%
Other values (20)89533
 
1.1%
ValueCountFrequency (%)
-12534
97.7%
294
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII8168836
100.0%

Most frequent character per block

ValueCountFrequency (%)
a1477043
18.1%
e755402
9.2%
r743064
9.1%
u742453
9.1%
t726583
8.9%
l725042
8.9%
c724846
8.9%
n724112
8.9%
g724014
8.9%
R723916
8.9%
Other values (22)102361
 
1.3%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
722493 
1
 
31979

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
Value   Count    Frequency (%)
0       722493   95.8%
1       31979    4.2%
Histogram of lengths of the category
ValueCountFrequency (%)
0722493
95.8%
131979
 
4.2%

Most occurring characters

ValueCountFrequency (%)
0722493
95.8%
131979
 
4.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0722493
95.8%
131979
 
4.2%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0722493
95.8%
131979
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0722493
95.8%
131979
 
4.2%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
1
603794 
0
150678 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row1
4th row1
5th row1
Value   Count    Frequency (%)
1       603794   80.0%
0       150678   20.0%
Histogram of lengths of the category
ValueCountFrequency (%)
1603794
80.0%
0150678
 
20.0%

Most occurring characters

ValueCountFrequency (%)
1603794
80.0%
0150678
 
20.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
1603794
80.0%
0150678
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
1603794
80.0%
0150678
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
1603794
80.0%
0150678
 
20.0%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
727966 
1
 
26506

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
Value   Count    Frequency (%)
0       727966   96.5%
1       26506    3.5%
Histogram of lengths of the category
ValueCountFrequency (%)
0727966
96.5%
126506
 
3.5%

Most occurring characters

ValueCountFrequency (%)
0727966
96.5%
126506
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0727966
96.5%
126506
 
3.5%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0727966
96.5%
126506
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0727966
96.5%
126506
 
3.5%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
742525 
1
 
11947

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
Value   Count    Frequency (%)
0       742525   98.4%
1       11947    1.6%
Histogram of lengths of the category
ValueCountFrequency (%)
0742525
98.4%
111947
 
1.6%

Most occurring characters

ValueCountFrequency (%)
0742525
98.4%
111947
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0742525
98.4%
111947
 
1.6%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0742525
98.4%
111947
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0742525
98.4%
111947
 
1.6%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
737138 
1
 
17334

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
Value   Count    Frequency (%)
0       737138   97.7%
1       17334    2.3%
Histogram of lengths of the category
ValueCountFrequency (%)
0737138
97.7%
117334
 
2.3%

Most occurring characters

ValueCountFrequency (%)
0737138
97.7%
117334
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0737138
97.7%
117334
 
2.3%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0737138
97.7%
117334
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0737138
97.7%
117334
 
2.3%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
700539 
1
 
53933

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
Value   Count    Frequency (%)
0       700539   92.9%
1       53933    7.1%
Histogram of lengths of the category
ValueCountFrequency (%)
0700539
92.9%
153933
 
7.1%

Most occurring characters

ValueCountFrequency (%)
0700539
92.9%
153933
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0700539
92.9%
153933
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0700539
92.9%
153933
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0700539
92.9%
153933
 
7.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
559252 
1
195220 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
Value   Count    Frequency (%)
0       559252   74.1%
1       195220   25.9%
Histogram of lengths of the category
ValueCountFrequency (%)
0559252
74.1%
1195220
 
25.9%

Most occurring characters

ValueCountFrequency (%)
0559252
74.1%
1195220
 
25.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0559252
74.1%
1195220
 
25.9%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0559252
74.1%
1195220
 
25.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0559252
74.1%
1195220
 
25.9%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
693776 
1
 
60696

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
Value   Count    Frequency (%)
0       693776   92.0%
1       60696    8.0%
Histogram of lengths of the category
ValueCountFrequency (%)
0693776
92.0%
160696
 
8.0%

Most occurring characters

ValueCountFrequency (%)
0693776
92.0%
160696
 
8.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0693776
92.0%
160696
 
8.0%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0693776
92.0%
160696
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0693776
92.0%
160696
 
8.0%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
724455 
1
 
30017

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
Value   Count    Frequency (%)
0       724455   96.0%
1       30017    4.0%
Histogram of lengths of the category
ValueCountFrequency (%)
0724455
96.0%
130017
 
4.0%

Most occurring characters

ValueCountFrequency (%)
0724455
96.0%
130017
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0724455
96.0%
130017
 
4.0%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0724455
96.0%
130017
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0724455
96.0%
130017
 
4.0%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0
742102 
1
 
12370

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row0
3rd row0
4th row0
5th row0
Value   Count    Frequency (%)
0       742102   98.4%
1       12370    1.6%
Histogram of lengths of the category
ValueCountFrequency (%)
0742102
98.4%
112370
 
1.6%

Most occurring characters

ValueCountFrequency (%)
0742102
98.4%
112370
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number754472
100.0%

Most frequent character per category

ValueCountFrequency (%)
0742102
98.4%
112370
 
1.6%

Most occurring scripts

ValueCountFrequency (%)
Common754472
100.0%

Most frequent character per script

ValueCountFrequency (%)
0742102
98.4%
112370
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII754472
100.0%

Most frequent character per block

ValueCountFrequency (%)
0742102
98.4%
112370
 
1.6%

has_superstructure_other
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size41.7 MiB
0  745389
1  9083

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters754472
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
Value  Count   Frequency (%)
0      745389  98.8%
1      9083    1.2%
[Figure: Histogram of lengths of the category]
Value  Count   Frequency (%)
0      745389  98.8%
1      9083    1.2%

Most occurring characters

Value  Count   Frequency (%)
0      745389  98.8%
1      9083    1.2%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  754472  100.0%

Most frequent character per category

Value  Count   Frequency (%)
0      745389  98.8%
1      9083    1.2%

Most occurring scripts

Value   Count   Frequency (%)
Common  754472  100.0%

Most frequent character per script

Value  Count   Frequency (%)
0      745389  98.8%
1      9083    1.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  754472  100.0%

Most frequent character per block

Value  Count   Frequency (%)
0      745389  98.8%
1      9083    1.2%

technical_solution_proposed
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size50.3 MiB
Reconstruction  465543
Major repair    128086
Minor repair    109497
No need         51346

Length

Max length14
Median length14
Mean length12.89381183
Min length7

Characters and Unicode

Total characters9728020
Distinct characters17
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
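
As a rough illustration of how character properties can be tallied for a textual variable, the sketch below counts characters per Unicode general category using Python's standard `unicodedata` module. This is a stand-in chosen for the example; the report generator may rely on a different library. The helper name `category_counts` and the `CATEGORY_NAMES` mapping are assumptions made here, and the sample strings are taken from the values of this variable.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
# The mapping is an assumption for readability; unicodedata returns two-letter codes.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = ["No need", "Reconstruction", "Major repair"]
print(category_counts(sample))
```

Applied to the full column, counts of this kind would correspond to the "Most occurring categories" table for the variable.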

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNo need
2nd rowReconstruction
3rd rowReconstruction
4th rowReconstruction
5th rowNo need
Value           Count   Frequency (%)
Reconstruction  465543  61.7%
Major repair    128086  17.0%
Minor repair    109497  14.5%
No need         51346   6.8%
[Figure: Histogram of lengths of the category]
Value           Count   Frequency (%)
reconstruction  465543  44.6%
repair          237583  22.8%
major           128086  12.3%
minor           109497  10.5%
need            51346   4.9%
no              51346   4.9%

Most occurring characters

Value             Count    Frequency (%)
o                 1220015  12.5%
r                 1178292  12.1%
n                 1091929  11.2%
c                 931086   9.6%
t                 931086   9.6%
i                 812623   8.4%
e                 805818   8.3%
R                 465543   4.8%
s                 465543   4.8%
u                 465543   4.8%
Other values (7)  1360542  14.0%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  8684619  89.3%
Uppercase Letter  754472   7.8%
Space Separator   288929   3.0%

Most frequent character per category

Lowercase Letter
Value             Count    Frequency (%)
o                 1220015  14.0%
r                 1178292  13.6%
n                 1091929  12.6%
c                 931086   10.7%
t                 931086   10.7%
i                 812623   9.4%
e                 805818   9.3%
s                 465543   5.4%
u                 465543   5.4%
a                 365669   4.2%
Other values (3)  417015   4.8%

Uppercase Letter
Value  Count   Frequency (%)
R      465543  61.7%
M      237583  31.5%
N      51346   6.8%

Space Separator
Value    Count   Frequency (%)
(space)  288929  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   9439091  97.0%
Common  288929   3.0%

Most frequent character per script

Latin
Value             Count    Frequency (%)
o                 1220015  12.9%
r                 1178292  12.5%
n                 1091929  11.6%
c                 931086   9.9%
t                 931086   9.9%
i                 812623   8.6%
e                 805818   8.5%
R                 465543   4.9%
s                 465543   4.9%
u                 465543   4.9%
Other values (6)  1071613  11.4%

Common
Value    Count   Frequency (%)
(space)  288929  100.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  9728020  100.0%

Most frequent character per block

Value             Count    Frequency (%)
o                 1220015  12.5%
r                 1178292  12.1%
n                 1091929  11.2%
c                 931086   9.6%
t                 931086   9.6%
i                 812623   8.4%
e                 805818   8.3%
R                 465543   4.8%
s                 465543   4.8%
u                 465543   4.8%
Other values (7)  1360542  14.0%

damage_grade
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size46.0 MiB
Grade 5  273008
Grade 4  182006
Grade 3  135048
Grade 2  86384
Grade 1  78026

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters5281304
Distinct characters11
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowGrade 1
2nd rowGrade 5
3rd rowGrade 5
4th rowGrade 5
5th rowGrade 1
Value    Count   Frequency (%)
Grade 5  273008  36.2%
Grade 4  182006  24.1%
Grade 3  135048  17.9%
Grade 2  86384   11.4%
Grade 1  78026   10.3%
[Figure: Histogram of lengths of the category]
Value  Count   Frequency (%)
grade  754472  50.0%
5      273008  18.1%
4      182006  12.1%
3      135048  8.9%
2      86384   5.7%
1      78026   5.2%

Most occurring characters

Value    Count   Frequency (%)
G        754472  14.3%
r        754472  14.3%
a        754472  14.3%
d        754472  14.3%
e        754472  14.3%
(space)  754472  14.3%
5        273008  5.2%
4        182006  3.4%
3        135048  2.6%
2        86384   1.6%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  3017888  57.1%
Uppercase Letter  754472   14.3%
Space Separator   754472   14.3%
Decimal Number    754472   14.3%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
5      273008  36.2%
4      182006  24.1%
3      135048  17.9%
2      86384   11.4%
1      78026   10.3%

Lowercase Letter
Value  Count   Frequency (%)
r      754472  25.0%
a      754472  25.0%
d      754472  25.0%
e      754472  25.0%

Uppercase Letter
Value  Count   Frequency (%)
G      754472  100.0%

Space Separator
Value    Count   Frequency (%)
(space)  754472  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   3772360  71.4%
Common  1508944  28.6%

Most frequent character per script

Common
Value    Count   Frequency (%)
(space)  754472  50.0%
5        273008  18.1%
4        182006  12.1%
3        135048  8.9%
2        86384   5.7%
1        78026   5.2%

Latin
Value  Count   Frequency (%)
G      754472  20.0%
r      754472  20.0%
a      754472  20.0%
d      754472  20.0%
e      754472  20.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  5281304  100.0%

Most frequent character per block

Value    Count   Frequency (%)
G        754472  14.3%
r        754472  14.3%
a        754472  14.3%
d        754472  14.3%
e        754472  14.3%
(space)  754472  14.3%
5        273008  5.2%
4        182006  3.4%
3        135048  2.6%
2        86384   1.6%

Interactions

[Figures: pairwise interaction plots; images not preserved in this text version]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that, for a linear relationship, the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
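
As a minimal sketch of the calculation described above, the function below divides the covariance of two variables by the product of their standard deviations. The name `pearson_r` and the sample values are assumptions made for illustration; this is not the profiling tool's own implementation.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance of xs and ys over the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations. The 1/n factors cancel in the
    # ratio, so using sample (n - 1) statistics would give the same r.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

# Hypothetical values, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

r = pearson_r(xs, ys)
# Shifting and rescaling one variable (with a positive scale) leaves r unchanged.
r_shifted = pearson_r([10 * x + 3 for x in xs], ys)
print(round(r, 4), round(r_shifted, 4))
```

The two printed values agree, which illustrates the invariance under changes of location and scale.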

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
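
The pair-counting definition above can be sketched directly. Note that this is the simple variant often called tau-a, which does not adjust for ties; statistical libraries commonly compute a tie-adjusted variant instead, so results may differ on data with many ties. The function name `kendall_tau_a` and the sample data are assumptions for illustration.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie adjustment."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(zip(xs, ys), 2))
    for (x1, y1), (x2, y2) in pairs:
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither group
    return (concordant - discordant) / len(pairs)

# Three observations: two concordant pairs and one discordant pair.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

This brute-force version compares every pair, so its cost grows quadratically with the number of observations; it is meant to show the definition rather than to be run on a dataset of this size.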

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
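
A sketch of a bias-corrected Cramér's V is given below, following the correction as it is commonly stated: the chi-squared statistic is turned into φ² = χ²/n, the expected bias (k-1)(r-1)/(n-1) is subtracted, and the numbers of rows and columns are shrunk accordingly. The function name `cramers_v_corrected` and the toy data are assumptions, and the report generator's own implementation may differ in detail.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: subtract the expected value of phi2 under independence
    # and shrink the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table (a single category): no association measurable
    return math.sqrt(phi2_corr / denom)

# Toy data: a perfectly associated pair and an independent pair.
perfect = cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(perfect, independent)
```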

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
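
The per-column nullity that such a chart summarises can be computed with a few lines of code. The sketch below is an illustration only: the helper name `nullity_by_column` is an assumption, the column names are borrowed from this report, and the row values (including the missing one) are invented, since the dataset itself reports no missing cells.

```python
def nullity_by_column(rows):
    """Return {column: number of missing (None) cells} for a list of row dicts."""
    columns = list(rows[0].keys())
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}

# Hypothetical rows: column names come from the report, the values are invented.
rows = [
    {"count_families": 1.0, "damage_grade": "Grade 1"},
    {"count_families": None, "damage_grade": "Grade 5"},
    {"count_families": 0.0, "damage_grade": "Grade 3"},
]
missing = nullity_by_column(rows)
total_missing = sum(missing.values())
print(missing, total_missing)
```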

Sample

First rows

df_indexbuilding_iddistrict_idvdcmun_idward_idlegal_ownership_statuscount_familieshas_secondary_usehas_secondary_use_agriculturehas_secondary_use_hotelhas_secondary_use_rentalhas_secondary_use_institutionhas_secondary_use_schoolhas_secondary_use_industryhas_secondary_use_health_posthas_secondary_use_gov_officehas_secondary_use_use_policehas_secondary_use_othercount_floors_pre_eqage_buildingplinth_area_sq_ftheight_ft_pre_eqland_surface_conditionfoundation_typeroof_typeground_floor_typeother_floor_typepositionplan_configurationhas_superstructure_adobe_mudhas_superstructure_mud_mortar_stonehas_superstructure_stone_flaghas_superstructure_cement_mortar_stonehas_superstructure_mud_mortar_brickhas_superstructure_cement_mortar_brickhas_superstructure_timberhas_superstructure_bamboohas_superstructure_rc_non_engineeredhas_superstructure_rc_engineeredhas_superstructure_othertechnical_solution_proposeddamage_grade
0632841312004091341313104310404Private1.00.000000000001772010FlatRCRCC/RB/RBCRCNot applicableNot attachedRectangular00000000010No needGrade 1
1319471240402000391242408240802Private1.01.0100000000032031521Moderate slopeMud mortar-Stone/BrickBamboo/Timber-Light roofMudTImber/Bamboo-MudAttached-1 sideRectangular01000000000ReconstructionGrade 5
2490564286202000091282812281201Private1.00.0000000000023838213FlatMud mortar-Stone/BrickBamboo/Timber-Heavy roofMudTImber/Bamboo-MudNot attachedRectangular01000000000ReconstructionGrade 5
3215031224202001381222201220106Private1.00.0000000000028020012FlatMud mortar-Stone/BrickBamboo/Timber-Heavy roofMudTImber/Bamboo-MudNot attachedRectangular01000000000ReconstructionGrade 5
4156516214508000132212106210608Private0.00.000000000002046513FlatMud mortar-Stone/BrickBamboo/Timber-Light roofMudTImber/Bamboo-MudNot attachedRectangular01000000000No needGrade 1
5187577221701001641222201220101Private1.00.0000000000031065121FlatMud mortar-Stone/BrickBamboo/Timber-Heavy roofMudTImber/Bamboo-MudNot attachedRectangular01000011000Major repairGrade 3
658192202005000561202009200904Private1.00.000000000002230212Moderate slopeBamboo/TimberBamboo/Timber-Heavy roofMudTimber-PlanckNot attachedRectangular01000010000No needGrade 3
7503026291802020431292904290402Private1.01.0010000000026039815FlatMud mortar-Stone/BrickBamboo/Timber-Light roofMudTimber-PlanckAttached-2 sideRectangular01000000000ReconstructionGrade 5
8552861302804001931303001300102Private1.00.0000000000032226318FlatMud mortar-Stone/BrickBamboo/Timber-Light roofMudTImber/Bamboo-MudNot attachedRectangular01000000000ReconstructionGrade 5
9382732246404000241242408240803Private1.00.0000000000022015514FlatMud mortar-Stone/BrickBamboo/Timber-Light roofMudTImber/Bamboo-MudNot attachedRectangular01000000000ReconstructionGrade 2

Last rows

df_indexbuilding_iddistrict_idvdcmun_idward_idlegal_ownership_statuscount_familieshas_secondary_usehas_secondary_use_agriculturehas_secondary_use_hotelhas_secondary_use_rentalhas_secondary_use_institutionhas_secondary_use_schoolhas_secondary_use_industryhas_secondary_use_health_posthas_secondary_use_gov_officehas_secondary_use_use_policehas_secondary_use_othercount_floors_pre_eqage_buildingplinth_area_sq_ftheight_ft_pre_eqland_surface_conditionfoundation_typeroof_typeground_floor_typeother_floor_typepositionplan_configurationhas_superstructure_adobe_mudhas_superstructure_mud_mortar_stonehas_superstructure_stone_flaghas_superstructure_cement_mortar_stonehas_superstructure_mud_mortar_brickhas_superstructure_cement_mortar_brickhas_superstructure_timberhas_superstructure_bamboohas_superstructure_rc_non_engineeredhas_superstructure_rc_engineeredhas_superstructure_othertechnical_solution_proposeddamage_grade
754462582560304208000151303009300913Private1.00.0000000000022824716FlatMud mortar-Stone/BrickBamboo/Timber-Heavy roofMudTImber/Bamboo-MudNot attachedRectangular01000000000ReconstructionGrade 4
754463398019247109001151242401240105Private1.00.0000000000021856715FlatMud mortar-Stone/BrickBamboo/Timber-Light roofMudTImber/Bamboo-MudAttached-1 sideRectangular11000010000ReconstructionGrade 5
754464193783222304000871222204220404Private1.00.000000000003834718Moderate slopeMud mortar-Stone/BrickBamboo/Timber-Light roofMudTimber-PlanckNot attachedRectangular01000000000ReconstructionGrade 5
75446597897204403000801202007200706Private1.00.0000000000021448621FlatMud mortar-Stone/BrickBamboo/Timber-Light roofMudTImber/Bamboo-MudNot attachedRectangular01000011000ReconstructionGrade 4
754466397094247101001681242405240501Private1.00.000000000002749516FlatMud mortar-Stone/BrickBamboo/Timber-Light roofMudTImber/Bamboo-MudNot attachedRectangular01000000000ReconstructionGrade 5
754467191690222105000421222209220903Private1.00.000000000003823021FlatMud mortar-Stone/BrickBamboo/Timber-Heavy roofMudTImber/Bamboo-MudAttached-1 sideRectangular01000000000ReconstructionGrade 4
754468615995311409000571313102310202Private0.00.000000000002245012FlatBamboo/TimberBamboo/Timber-Light roofMudTimber-PlanckNot attachedRectangular00000010000Minor repairGrade 3
75446933748124903000611121208120803Private1.00.0000000000013025310FlatMud mortar-Stone/BrickBamboo/Timber-Light roofMudNot applicableNot attachedRectangular01000000000ReconstructionGrade 5
754470418013280407000882282801280111Private1.00.0000000000023540015Moderate slopeMud mortar-Stone/BrickBamboo/Timber-Heavy roofMudTimber-PlanckAttached-1 sideRectangular01000000000ReconstructionGrade 4
754471607602311004001611313111311109Private0.00.0000000000028021017FlatMud mortar-Stone/BrickBamboo/Timber-Light roofMudTImber/Bamboo-MudAttached-1 sideRectangular01000000000Major repairGrade 3